Reverberation via Convolution

Reverberation

Reverberation refers to the persistence of a sound after the source has stopped producing it. Reverberation typically consists of many decaying echoes of the original sound, each echo decaying as it is absorbed by the materials surrounding the source. Imagine the resulting sound from clapping in an empty lecture theatre or an empty concert hall -- what you experience is reverberation. The goal of this notebook is to implement reverberation on an audio sample.

Digital Reverberation

There are, in general, two types of digital reverb: algorithmic reverb, which builds reverberation out of networks of filters (the simplest digital reverberator consists of four parallel comb filters followed by two all-pass filters in series, themselves built from comb filters); and convolution reverb.

Implementing a quality algorithmic reverberation is quite difficult. To get a sense of how complicated the filtering networks in such reverberators are, check out the circuit on the last page of this paper. That reverberator is considered the simplest digital reverberator, and it was the first one implemented. Its quality is also not very good: it sounds incredibly metallic, more like passing audio through a tube than a clean reverb. Algorithmic reverb is hard to implement because physical reverb is so complicated. Mimicking all the physics involved -- absorption, reflection, high echo densities -- with linear filters requires a very complex algorithm.
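To illustrate the building block those filter networks are made of, here is a minimal sketch of a single feedback comb filter (the function name and parameters are my own, not from any reverb library): a unit impulse fed through it comes out as a train of echoes spaced `delay` samples apart, each attenuated by `gain`.

```python
import numpy as np

def feedback_comb(x, delay, gain):
    """Feedback comb filter: y[n] = x[n] + gain * y[n - delay].

    Each pass through the feedback loop adds another echo, `delay`
    samples later and `gain` times quieter -- the basic ingredient of
    algorithmic reverberators.
    """
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = x[n] + (gain * y[n - delay] if n >= delay else 0.0)
    return y

# A unit impulse becomes a decaying echo train.
impulse = np.zeros(16)
impulse[0] = 1.0
echoes = feedback_comb(impulse, delay=4, gain=0.5)
# echoes[0], echoes[4], echoes[8], echoes[12] -> 1.0, 0.5, 0.25, 0.125
```

A handful of these combs in parallel, summed and passed through all-pass stages, gives the Schroeder-style structure described above -- and hints at why the echo pattern of a few combs sounds so regular and metallic compared to a real room.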

With all of that being said, the purpose of this notebook is to create a reverb program which applies a clean reverb onto an audio sample via convolution reverb. Convolution-based reverberation simulates physical reverberation via convolution with the impulse response of an audio-reflective room. By investigating how the room responds to all frequencies, we can make (theoretically) any audio sample sound like it was played in that room via convolution with the room's impulse response. For this project, I'm very thankful to researchers at the University of Helsinki for making some concert hall impulse responses freely available. There is also another free source of responses at echothief. This repository of impulse responses features caves, train stations, and even the response from an abandoned coastal artillery battery.

If $h[n]$ is the impulse response of a given room, then applying the reverberation is incredibly simple. If $x[n]$ is our input audio sample, then the reverbed audio, $y[n]$, is given by

$$ y[n] = h[n] * x[n] = \sum_{k=-\infty}^{\infty} h[k] x[n-k]$$

To speed this process up we implement the convolution via frequency-domain multiplication, so we will use SciPy's fftconvolve function to perform the convolution.
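On a toy pair of signals, the call looks like this (the arrays here are stand-ins for the audio and impulse response, chosen small enough to check by hand against the sum above):

```python
import numpy as np
from scipy.signal import fftconvolve

x = np.array([1.0, 2.0, 3.0])  # toy input audio sample
h = np.array([1.0, 0.5])       # toy impulse response

# Full convolution, computed via the FFT; output length is len(x) + len(h) - 1.
y = fftconvolve(x, h)
# y -> [1.0, 2.5, 4.0, 1.5]
```

Note that the output is longer than the input: the extra samples are exactly the reverb tail that continues after the input ends.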

The impulse response we will use in this notebook is from the University of Helsinki website linked above, and it is an omnidirectional response from the audience area. The notebook assumes a stereo, 32-bit float encoded .wav file as input. If you have an audio sample that you want to try applying reverb to with this code, use software like Audacity to re-encode the .wav file.
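Putting the pieces together, here is a sketch of the full pipeline: read both stereo files, convolve each channel with the matching channel of the response, and write the result. The filenames are placeholders, and the synthetic data at the top only exists so the sketch runs on its own; in practice you would read your own song and impulse response files instead.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve

fs = 44100

# Synthetic stand-ins so this sketch is self-contained; replace
# "song.wav" and "impulse.wav" with your own 32-bit float stereo files.
rng = np.random.default_rng(0)
wavfile.write("song.wav", fs, rng.standard_normal((fs, 2)).astype(np.float32))
wavfile.write("impulse.wav", fs, rng.standard_normal((fs // 2, 2)).astype(np.float32))

fs_x, x = wavfile.read("song.wav")     # shape (N, 2), float32
fs_h, h = wavfile.read("impulse.wav")  # shape (M, 2), float32
assert fs_x == fs_h, "song and impulse response must share a sample rate"

# Convolve each stereo channel with the matching channel of the response.
wet = np.stack([fftconvolve(x[:, c], h[:, c]) for c in range(2)], axis=1)
wet = (wet / np.max(np.abs(wet))).astype(np.float32)  # normalise to avoid clipping

wavfile.write("reverbed.wav", fs_x, wet)
```

The normalisation step matters: the convolution sum can easily exceed the $[-1, 1]$ range of float .wav audio, so dividing by the peak keeps the output from clipping on playback.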

The reverb obtained from this impulse response is quite clean. In order to fill out the audio, we can simply add the original audio to the reverbed audio at a given ratio to get a nice-sounding result.
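A small sketch of that wet/dry blend (the `mix` helper and its `ratio` parameter are my own names, not from any library): since the reverbed signal is longer than the original, the dry signal is zero-padded to match before the two are combined.

```python
import numpy as np

def mix(dry, wet, ratio=0.5):
    """Blend original (dry) and reverbed (wet) audio.

    `ratio` is the wet fraction. The wet signal carries the reverb
    tail and is therefore longer, so the dry signal is zero-padded
    to the same length before blending.
    """
    padded = np.zeros_like(wet)
    padded[: len(dry)] = dry
    return (1.0 - ratio) * padded + ratio * wet

dry = np.array([1.0, 1.0])
wet = np.array([0.5, 0.5, 0.5])
blended = mix(dry, wet, ratio=0.5)
# blended -> [0.75, 0.75, 0.25]
```

A `ratio` near 0 keeps the track mostly dry with a hint of room; near 1 it sounds like a distant microphone in the hall.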

It has recently become popular not only to reverb a song but to slow it down as well. We can choose a slowing factor $\alpha$ and simply undersample the song while writing it out to slow it down.

Note: it's advisable when undersampling to make sure the original sampling rate of the song is quite high. Songs usually have a project rate of 44.1 kHz; when undersampling, I'd recommend using software like Audacity (linked above) to increase the project rate to twice this, or even more depending on how much slowing you want to do. This is because we want to ensure that we're adequately representing the entire frequency content of the song when undersampling: having more closely spaced data to begin with ensures that the band of frequencies we can faithfully reconstruct still covers the range of human hearing.
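One simple way to realise this, consistent with the note above (an assumption on my part about the intended scheme, not a quote of the notebook's code): keep every sample but write the file with the reduced nominal rate $f_s / \alpha$, which plays the song back $\alpha$ times slower and proportionally lower in pitch.

```python
import numpy as np
from scipy.io import wavfile

alpha = 1.5  # slowing factor
fs = 88200   # original (raised) project rate, per the note above

# Synthetic stereo audio as a stand-in for the reverbed song.
t = np.arange(fs) / fs
audio = np.stack([np.sin(2 * np.pi * 440 * t)] * 2, axis=1).astype(np.float32)

# Writing the same samples with a lower nominal rate plays them back
# alpha times slower (and alpha times lower in pitch). The playback
# bandwidth shrinks to (fs / alpha) / 2, which is why starting from a
# high project rate keeps the audible band intact.
wavfile.write("slowed.wav", int(fs / alpha), audio)
```

With $f_s = 88200$ and $\alpha = 1.5$, the output rate is 58800 Hz, so the reconstructable band after slowing is still 29.4 kHz, comfortably above the limit of human hearing.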

For reference the original .wav file is played below: